Syntactic Processing of Unknown Words
Abstract
A method is presented for processing sentences that contain unknown words, i.e. words for which no lexical entry exists. Processing has three stages:
1. The sentence containing the unknown word is parsed. The parsing algorithm needs no special properties, but the lexical lookup procedure must be modified.
2. Information about the unknown word is extracted from the syntactic structure of the parse.
3. The information obtained in step 2 may be too fully specified for a lexical entry, so a filter is applied to it to create a new lexical entry.
The method is illustrated with examples from Categorial Unification Grammar, and the problem of using the extracted information for lexical knowledge acquisition is discussed.
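The three stages can be made concrete with a toy sketch. It is not the method of the report: it uses flat feature dictionaries and a single hard-coded sentence pattern instead of Categorial Unification Grammar, and every name in it (`LEXICON`, `KEEP`, `lookup`, `unify`, `parse`, `extract`, `filter_entry`) is a hypothetical choice made for illustration.

```python
# Toy illustration only: flat feature dicts and one fixed sentence pattern,
# NOT the Categorial Unification Grammar used in the report.
LEXICON = {
    "the": {"cat": "Det"},
    "dog": {"cat": "N", "num": "sg"},
    "sleeps": {"cat": "V", "num": "sg"},
}
PATTERN = ["Det", "N", "V"]   # assumed single rule: S -> Det N V
KEEP = {"cat"}                # assumed filter: features allowed in a new entry


def lookup(word):
    """Modified lexical lookup: an unknown word gets an empty, open entry."""
    return dict(LEXICON[word]) if word in LEXICON else {}


def unify(a, b):
    """Unify two flat feature dicts; return None if a feature clashes."""
    out = dict(a)
    for key, value in b.items():
        if key in out and out[key] != value:
            return None
        out[key] = value
    return out


def parse(words):
    """Stage 1: match the words against PATTERN, enforcing N/V number agreement."""
    if len(words) != len(PATTERN):
        return None
    entries = []
    for word, cat in zip(words, PATTERN):
        entry = unify(lookup(word), {"cat": cat})
        if entry is None:
            return None
        entries.append(entry)
    noun, verb = entries[1], entries[2]
    agreement = unify({k: v for k, v in noun.items() if k == "num"},
                      {k: v for k, v in verb.items() if k == "num"})
    if agreement is None:
        return None
    noun.update(agreement)
    verb.update(agreement)
    return entries


def extract(words, entries):
    """Stage 2: collect what the parse says about each unknown word."""
    return {w: e for w, e in zip(words, entries) if w not in LEXICON}


def filter_entry(entry):
    """Stage 3: drop features judged too specific for a lexical entry."""
    return {k: v for k, v in entry.items() if k in KEEP}


words = ["the", "wug", "sleeps"]
entries = parse(words)
found = extract(words, entries)         # {'wug': {'cat': 'N', 'num': 'sg'}}
new_entry = filter_entry(found["wug"])  # {'cat': 'N'}
```

Here the parse assigns the unknown word "wug" the category N and, through agreement with the verb, singular number. The assumed filter then keeps only the category. Which features a real filter should keep is exactly the design question the abstract raises.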
Related resources
Analysis of Unknown Lexical Items using Morphological and Syntactic Information with the TIMIT Corpus
The importance of dealing with unknown words in Natural Language Processing (NLP) is growing as NLP systems are used in more and more applications. One aid in predicting the lexical class of words that do not appear in the lexicon, referred to as unknown words, is the use of syntactic parsing rules. The distinction between closed class and open class words together with morphological recognition appe...
Morpho-syntactic tagging system based on the patterns words for Arabic texts
Text tagging is a very important tool for various applications in natural language processing, namely the morphological and syntactic analysis of texts, indexation and information retrieval, "vocalization" of Arabic texts, and probabilistic language models (n-class model). However, these systems, being based on lexemes of limited size, are consequently unable to handle unknown words. To overcome thi...
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatically identifying the words that bear semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching the correct roles to them may lead to improvement in many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Post Mortem Parsing with Unknown Lexical Items using Morphological Recognition, Syntactic Information, and a Closed Class Lexicon
The importance of dealing with unknown words in natural language processing (NLP) is growing as NLP systems are used in more and more applications. The ability to parse sentences containing unknown words will make a parsing system more robust and flexible. The use of syntactic parsing rules provides constraints on the possible lexical categories of unknown words. A lexicon of closed class words also o...
Tuning an Existing Nomenclature for Specific Domain Corpora: A Syntax-Based Similarity Method
There is a constant need to extend and tune medical vocabularies to account for new words and new word usages. Robust natural language processing (NLP) tools can be applied to medical text corpora such as patient narratives and help collect and analyze unknown words [1,2]. The aim of the present work is to assess the potential for classifying unknown words based on the semantic categories of “nei...
Journal: IWBS Report
Volume: 131
Publication date: 1990